Which percentile-based approach should be preferred for calculating normalized citation impact values? An empirical comparison of five approaches including a newly developed citation-rank approach (P100)

Authors

  • Lutz Bornmann
  • Loet Leydesdorff
  • Jian Wang
Abstract

s, and editorial materials. Taken as a whole, we excluded only 7.7% (n=19) of the subject categories (at the subject category level) and 24.1% (n=151,272) of the papers (at the paper level). As a result, we kept 475,391 papers for the analysis, and the annual citation counts (from 1980 to 2010) of these papers were retrieved from WoS.

The reference sets were used to calculate values of P100 and of the following four percentile-based approaches: those developed by Hazen, Thomson Reuters (InCites), SCImago, and CWTS. Each paper in WoS is classified into one unique document type but possibly into multiple subject categories. Therefore, for papers with multiple subject categories, different aggregation rules were implemented to construct a unique value (rank) for each paper. The average (percentile) rank is used for the approaches of Hazen, SCImago, and P100, and the best performance across reference sets is used for the InCites approach. When using the CWTS approach based on fractionation, the average probability of belonging to a specific group is considered as a fraction. Specifically, for any pre-specified (percentile) rank class (e.g., the top x% highly cited papers), each paper gets a number p indicating its probability of belonging to this top x% class (0≤p≤1). However, a paper can get different values for p in multiple reference sets, and the average value of p is used to construct a unique indicator for this paper (Ludo Waltman, personal communication, 5/30/2013).

Furthermore, the citation impact values (percentiles or ranks) could be too discrete if the reference set is too small. Therefore, only reference sets with at least 100 papers are included.[3] For example, if a paper belongs to two reference sets, A and B, and A has at least 100 papers while B has fewer than 100 papers, then the impact values based on B are discarded. If neither A nor B has at least 100 papers, then the results based on both A and B are discarded, and the paper is excluded from further analysis. As the reference sets are defined not only by subject category but also by document type, it is likely that the number of subject categories for reviews and notes is reduced more significantly than for articles. Reviews and notes are less frequent than articles and are therefore more likely to be excluded by the requirement of at least 100 papers in the reference set.

[2] Notes were removed from the database as a document type in 1997, but they were citable items in 1980.

[3] We decided to use 100 papers as a limit to produce reliable data. There is a high probability that a limit of 50 or 200 would lead to results similar to ours.
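The averaging rule and the 100-paper threshold described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the average-rank treatment of ties in the Hazen percentile, and the P100 formula (rank among unique citation counts, rescaled to 0-100) are assumptions made for illustration; only the averaging across reference sets and the minimum set size of 100 come from the text.

```python
from statistics import mean

MIN_REFERENCE_SET_SIZE = 100  # threshold stated in the text


def hazen_percentile(reference_counts, citations):
    """Hazen percentile (0-100) of a paper within one reference set.

    Assumption: tied papers share the average of their ascending ranks,
    and the paper itself is a member of the reference set.
    """
    n = len(reference_counts)
    below = sum(1 for c in reference_counts if c < citations)
    tied = sum(1 for c in reference_counts if c == citations)
    if tied == 0:
        raise ValueError("paper's citation count not found in reference set")
    average_rank = below + (tied + 1) / 2
    return 100 * (average_rank - 0.5) / n


def p100(reference_counts, citations):
    """P100 sketch: rank among the unique citation counts, scaled to 0-100.

    Assumption: ranks run from 0 to i_max over the sorted unique values.
    """
    unique = sorted(set(reference_counts))
    i_max = len(unique) - 1
    if i_max == 0:
        return 0.0  # hypothetical convention when all papers are tied
    return 100 * unique.index(citations) / i_max


def average_over_reference_sets(citations, reference_sets, indicator,
                                min_size=MIN_REFERENCE_SET_SIZE):
    """Average an indicator over all sufficiently large reference sets.

    Sets with fewer than `min_size` papers are discarded; if none remain,
    the paper is excluded (signalled here by returning None).
    """
    usable = [s for s in reference_sets if len(s) >= min_size]
    if not usable:
        return None
    return mean(indicator(s, citations) for s in usable)


# Toy data: set_a and set_c are large enough, set_b is too small.
set_a = list(range(100))          # citation counts 0..99
set_b = [0, 5, 50]                # only 3 papers, discarded
set_c = list(range(0, 200, 2))    # citation counts 0, 2, ..., 198

hazen_value = average_over_reference_sets(50, [set_a, set_b, set_c],
                                          hazen_percentile)
excluded = average_over_reference_sets(50, [set_b], hazen_percentile)
```

In this toy case the small set contributes nothing to the average, and a paper that belongs only to small sets drops out of the analysis. The InCites rule (best performance across sets) and the CWTS fractional rule would replace the averaging step and are not shown here.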

Similar resources

Is the new citation-rank approach P100′ in bibliometrics really new?

The percentile-based rating scale P100 describes the citation impact in terms of the distribution of unique citation values. This approach has recently been refined by considering also the frequency of papers with the same citation counts. Here I compare the resulting P100′ with P100 for an empirical dataset and a simple fictitious model dataset. It is shown that P100′ is not much different fro...

Examples for counterintuitive behavior of the new citation-rank indicator P100 for bibliometric evaluations

A new percentile-based rating scale P100 has recently been proposed to describe the citation impact in terms of the distribution of the unique citation values. Here I investigate P100 for 5 example datasets, two simple fictitious models and three larger empirical samples. Counterintuitive behavior is demonstrated in the model datasets, pointing to difficulties when the evolution with time of th...

From P100 to P100′: Conception and improvement of a new citation-rank approach in bibliometrics

Properties of a percentile-based rating scale needed in bibliometrics are formulated. Based on these properties, P100 was recently introduced as a new citation-rank approach (Bornmann, Leydesdorff, & Wang, in press). In this paper, we conceptualize P100 and propose an improvement which we call P100′. Advantages and disadvantages of citation-rank indicators are noted.

How to improve the outcome of performance evaluations in terms of percentiles for citation frequencies of my papers

Using empirical data, I demonstrate that the result of performance evaluations by percentiles can be drastically influenced by the proper choice of the journal in which a manuscript is published. 1. Introduction: In order to evaluate the impact of publications in terms of citation counts, percentile-based bibliometric indicators have recently gained more and more attention (Bornmann, Leydesdorff, ...

Turning the tables in citation analysis one more time: Principles for comparing sets of documents by using an “Integrated Impact Indicator” (I3)

We submit newly developed citation impact indicators based not on arithmetic averages of citations but on percentile ranks. Citation distributions are—as a rule—highly skewed and should not be arithmetically averaged. With percentile ranks, the citation of each paper is rated in terms of its percentile in the citation distribution. The percentile ranks approach allows for the formulation of a m...

Journal title:
  • J. Informetrics

Volume: 7

Publication date: 2013